Abstract

We describe a language for defining term rewriting strategies,
and its application to the production of program optimizers.
Valid transformations on program terms can be described by a
set of rewrite rules; rewriting strategies are used to describe
when and how the various rules should be applied in order to
obtain the desired optimization effects. Separating rules from
strategies in this fashion makes it easier to reason about the
behavior of the optimizer as a whole, compared to traditional
monolithic optimizer implementations. We illustrate the
expressiveness of our language by using it to describe a simple
optimizer for an ML-like intermediate representation.

The basic strategy language uses operators such as sequential
composition, choice, and recursion to build transformers from a
set of labeled unconditional rewrite rules. We also define an
extended language in which the side-conditions and contextual
rules that arise in realistic optimizer specifications can
themselves be expressed as strategy-driven rewrites. We show
that the features of the basic and extended languages can be
expressed by breaking down the rewrite rules into their
primitive building blocks, namely matching and building terms
in restricted environments. This primitive representation forms
the basis of a simple implementation that generates efficient
C code.

1  Introduction 

Compiler components such as parsers, pretty-printers and code
generators are routinely produced using program generators. The
component is specified in a high-level language from which the
program generator produces its implementation. Program
optimizers are difficult, labor-intensive components for which
few program generation techniques have been developed to date.

A program optimizer transforms the source code of a program
into a program that has the same meaning, but is more
efficient. On the level of specification and documentation,
optimizers are often presented as a set of correctness-preserving
rewrite rules that transform code fragments into equivalent,
more efficient code fragments (e.g., see Table 5). Examples of
optimizers for functional programs are discussed in [3, 4, 20].
The paradigm provided by conventional rewrite engines is to
compute the normal form of a program with respect to a set of
rewrite rules. However, optimizers are usually not implemented
in this way. Instead, an algorithm is produced that implements
a strategy for applying the optimization rules. Such a strategy
contains meta-knowledge about the set of rewrite rules and the
programming language they are applied to in order to (1) guide
the application of rules; (2) guarantee termination of
optimization; (3) make optimization more efficient.

Such an ad-hoc implementation of a rewriting system has several
drawbacks, even when implemented in a language with good
support for pattern matching, such as ML or Haskell. First of
all, the transformation rules are embedded in the code of the
optimizer, making it hard to understand, to maintain, and to
reuse individual rules in other transformations. Furthermore,
the strategy is not specified at the same level of abstraction
as the transformation rules, making it hard to reason about the
correctness of the optimizer even if the individual rules are
correct.

It would be desirable to apply term rewriting technology
directly to produce program optimizers. The standard approach
to rewriting is to provide a fixed strategy (e.g., innermost or
outermost) for normalizing a term with respect to a set of
user-defined rewrite rules. This is not satisfactory when, as
is usually the case for optimizers, the rewrite rules are
neither confluent nor terminating. A common work-around is to
encode a strategy into the rules themselves, e.g., by using an
explicit function symbol that controls where rewrites are
allowed. But this approach has the same disadvantages as the
ad-hoc implementation of rewriting described above: the rules
are hard to read, and the strategies are still expressed at a
low level of abstraction.

In this paper we argue that a better solution is to use
explicit specification of rewriting strategies. We show how
program optimizers can be built by means of a set of labeled
rewrite rules and a user-defined strategy for applying these
rules. In this approach transformation rules can be defined
independently of any strategy, so the designer can concentrate
on defining a set of correct transformation rules for a
programming language. The transformation rules can then be used
in many independent strategies that are specified in a formally
defined strategy language. Given such a high-level
specification of a program optimizer, a compiler can generate
efficient code for executing the optimization rules.

Starting with simple unconditional rewrite rules as atomic
strategies we introduce in Section 2 the basic combinators for
building rewriting strategies. We give examples of strategies
and define their operational semantics.

In Section 3 we explore optimization rules for RML programs, an
intermediate format for ML-like programs [21]. This example
shows that there is a gap between the unconditional rewrite
rules used in rewriting and the transformation rules used for
optimizations. For this reason, we enrich rewrite strategies
with features such as conditions, contexts and alpha renaming.

In  order  to  avoid  complicating  the  language  by  many 
ad-hoc  features,  we  refine  the  strategy  language  in  Section  4 
by  breaking  down  rewrite  rules  into  the  notions  of  matching 
and  building  of  terms.  In  Section  5  we  show  how  this  refined 
language  can  be  used  to  define  rules  with  conditions  and 
contexts.  In  Section  6  we  use  the  resulting  language  to  give 
a  formal  specification  of  the  RML  rules  presented  earlier. 

2  Rewriting  Strategies 

A  rewriting  strategy  is  an  algorithm  for  applying  rewrite 
rules.  In  this  section  we  introduce  the  building  blocks  for 
specifying  such  algorithms  and  give  several  examples  of  their 
application.  The  strategy  language  presented  in  this  section 
is  an  extension  of  previous  work  [16]  of  one  of  the  present 
authors. 

2.1  Terms 

We will represent expressions in the object language by means
of first-order terms. A first-order term is a variable, a
constant, a tuple of one or more terms, or an application of a
constructor to one or more terms. This is summarized by the
following grammar:

    t ::= x | c | (t1, ..., tn) | f(t1, ..., tn)

where x represents variables (lowercase identifiers), c
represents constants (uppercase identifiers or integers) and f
represents constructors (uppercase identifiers). We denote the
set of all variables by X, the set of terms with variables by
T(X) and the set of ground terms (terms without variables) by
T. Terms can be typed by means of signatures. For simplicity of
presentation, we will consider only untyped terms in this paper
until Section 6.

2.2  Rewrite  Rules 

The basis of a strategy is a set of labeled rewrite rules of
the form ℓ : l → r, where ℓ is a label, and l and r are
first-order terms. For example, consider the following rewrite
rules on a small language of lists constructed with Cons and
Nil and providing the functions Conc and Rev.

    Cnc1 : Conc(Nil, xs)          → xs
    Cnc2 : Conc(Cons(x, xs), ys)  → Cons(x, Conc(xs, ys))
    Rev1 : Rev(Nil, ys)           → ys
    Rev2 : Rev(Cons(x, xs), ys)   → Rev(xs, Cons(x, ys))

The first two rules define the concatenation of two lists. The
last two rules define the reversal of a list by shifting
elements of the first list to the second list until the first
is empty and the second is the reversed list.


A rewrite rule specifies a single step transformation of a
term. For example, rule Cnc2 induces the following
transformation:

    Conc(Cons(1, Nil), Cons(2, Nil))
      --Cnc2--> Cons(1, Conc(Nil, Cons(2, Nil)))

In general, a rewrite rule defines a labeled transition
relation between terms and reducts, as formalized in the
operational semantics in Table 1. A reduct is either a term or
↑, which denotes failure. The first rule defines that a rule ℓ
transforms a term t into a term t' if there exists a
substitution σ mapping variables to terms such that t is a
σ-instance of the left-hand side l and t' is a σ-instance of
the right-hand side r. The second rule states that an attempt
to transform a term t with rule ℓ fails, if there is no
substitution σ such that t is a σ-instance of l. Note that a
rewrite rule applies at the root of a term. Later on we will
introduce operators for applying a rule to a subterm.


    t --ℓ--> t'   if ∃σ : σ(l) = t ∧ σ(r) = t'
    t --ℓ--> ↑    if ¬∃σ : σ(l) = t


Table 1: Operational semantics for unconditional rules.
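
The two rules of Table 1 can be read as a small matching algorithm. The following Python sketch is only an illustration under our own assumptions: terms are encoded as nested tuples whose first component is the constructor name, `Var` marks pattern variables, and `None` stands in for the failure reduct. None of these names come from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    """A pattern variable such as x or xs (hypothetical encoding)."""
    name: str

FAIL = None       # stands in for the failure reduct
NIL = ("Nil",)    # a constructor without arguments

def match(pattern, term, subst):
    """Return an extension of subst under which pattern equals term, or None."""
    if isinstance(pattern, Var):
        if pattern.name in subst:
            # A variable that is already bound must agree with the term.
            return subst if subst[pattern.name] == term else None
        return {**subst, pattern.name: term}
    if isinstance(pattern, tuple):
        if not (isinstance(term, tuple) and len(term) == len(pattern)
                and term[0] == pattern[0]):
            return None
        for p, t in zip(pattern[1:], term[1:]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    # Constants match only themselves.
    return subst if pattern == term else None

def instantiate(pattern, subst):
    """Replace every variable in pattern by its binding in subst."""
    if isinstance(pattern, Var):
        return subst[pattern.name]
    if isinstance(pattern, tuple):
        return (pattern[0],) + tuple(instantiate(p, subst) for p in pattern[1:])
    return pattern

def rule(lhs, rhs):
    """Build a root-level rewrite step for the rule lhs -> rhs."""
    def apply(term):
        subst = match(lhs, term, {})
        return FAIL if subst is None else instantiate(rhs, subst)
    return apply

x, xs, ys = Var("x"), Var("xs"), Var("ys")
cnc1 = rule(("Conc", NIL, xs), xs)
cnc2 = rule(("Conc", ("Cons", x, xs), ys), ("Cons", x, ("Conc", xs, ys)))

term = ("Conc", ("Cons", 1, NIL), ("Cons", 2, NIL))
result = cnc2(term)   # one step at the root
```

Applying `cnc1` to the same term returns `FAIL`, because no substitution makes `Conc(Nil, xs)` equal to a term whose first argument is a `Cons`.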


2.3  Reduction-Graph  Traversal 

The  reduction  graph  induced  by  a  set  of  rewrite  rules  is  the 
transitive  closure  of  the  single  step  transition  relation.  It 
forms  the  space  of  all  possible  transformations  that  can  be 
performed  with  those  rules. 

For instance, one path in the reduction graph induced by the
rules Rev1 and Rev2 is the following:

    Rev(Cons(1, Cons(2, Nil)), Nil)
      --Rev2--> Rev(Cons(2, Nil), Cons(1, Nil))
      --Rev2--> Rev(Nil, Cons(2, Cons(1, Nil)))
      --Rev1--> Cons(2, Cons(1, Nil))
A strategy is a compact description of a subset of all such
paths. Rewrite rules are atomic strategies that describe a path
of length one. In this section we consider combinators for
combining rules into more complex strategies. The operational
semantics of these strategy operators is defined in Table 2.

The fundamental operation for compounding the effects of two
transformations is the sequential composition s1 · s2 of two
strategies¹. It first applies s1 and, if that succeeds, it
applies s2. For example, the reduction path above is described
by the strategy Rev2 · Rev2 · Rev1.

The non-deterministic choice s1 + s2 chooses between the
strategies s1 and s2 such that the strategy chosen succeeds.
For instance, the strategy Rev1 + Rev2 applies either Rev1 or
Rev2. Note that due to this operator there can be more than one
way in which a strategy can succeed.

¹ The notation x · y is derived from the process algebra ACP [6]
and should not be confused with function composition.
Table 2: Operational semantics for basic combinators.


Strategies that repeatedly apply some rules can be defined
using the recursion operator μx.s. One strategy for the
complete evaluation of an application of Rev is:

    μx.(Rev1 + (Rev2 · x))

It tries to apply either rule Rev1 or Rev2. In the first case
(the first argument is Nil) evaluation is done. In the second
case, the entire strategy is invoked again through the
recursion variable x. This strategy will only succeed if it can
terminate with an application of Rev1.

With the non-deterministic choice operator the programmer has
no control over which strategy is chosen. The deterministic or
left choice operator s1 <+ s2 is biased to choose its left
argument first. It will consider the second strategy only if
there is no way in which the first can succeed. This operator
can be used to optimize the strategy for evaluating list
reversals. The strategy

    μx.((Rev2 · x) <+ Rev1)

always first tries to apply rule Rev2 before it considers Rev1.

The identity strategy ε always succeeds. It is often used in
conjunction with left choice to build an optional strategy:
s <+ ε tries to apply s, but when that fails just succeeds with
ε. The failure strategy δ is the dual of identity and always
fails.

The strategy test(s) can be used to test whether a strategy s
would succeed or fail without having the transforming effect of
s. The negation ¬s of a strategy s is similar to test, but
tests for failure of s. We will see examples of the application
of these operators in Section 6.

Redex and Normal Form We will call a term an ℓ-redex if it can
be transformed with a rule ℓ, otherwise it is in ℓ-normal form.
We will generalize this terminology to general strategies, i.e.
if t --s--> t', then t is an s-redex and if t --s--> ↑, then t
is in s-normal form.

Strategy Definitions In order to name common patterns of
strategies we will use strategy definitions. A definition
f(x1, ..., xn) = s introduces a new n-ary strategy operator f.
An application f(s1, ..., sn) of f to n strategies denotes the
instantiation s[x1 := s1 ... xn := sn] of the body of the
definition. Strategy definitions are not recursive and not
higher-order, i.e. it is not possible to give a strategy
operator as argument to a strategy operator. An example of a
common pattern is the application of a strategy to a term as
often as possible. This is expressed by the definitions

    repeat(s)  = μx.((s · x) <+ ε)
    repeat1(s) = s · repeat(s)

The strategy repeat(s) applies s zero or more times, but as
often as possible. The strategy repeat1(s) applies s one or
more times, but as often as possible. Using repeat, yet another
way of evaluating the application of Rev is the strategy
repeat(Rev2) · Rev1, which is equivalent to
μx.((Rev2 · x) <+ ε) · Rev1.
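
To make the two definitions concrete, here is a small Python sketch. It assumes a simplified deterministic model in which a strategy maps a term to a single result or to `None` on failure, so backtracking is deliberately left out. The tuple encoding of terms and the helper names are ours, not the paper's.

```python
NIL = ("Nil",)

def rev1(t):
    """Rev1: Rev(Nil, ys) -> ys"""
    if isinstance(t, tuple) and t[0] == "Rev" and t[1] == NIL:
        return t[2]
    return None

def rev2(t):
    """Rev2: Rev(Cons(x, xs), ys) -> Rev(xs, Cons(x, ys))"""
    if (isinstance(t, tuple) and t[0] == "Rev"
            and isinstance(t[1], tuple) and t[1][0] == "Cons"):
        _, (_, x, xs), ys = t
        return ("Rev", xs, ("Cons", x, ys))
    return None

def seq(s1, s2):
    """Sequential composition: s2 runs only if s1 succeeded."""
    def s(t):
        r = s1(t)
        return None if r is None else s2(r)
    return s

def lchoice(s1, s2):
    """Left choice: fall back to s2 only if s1 failed."""
    def s(t):
        r = s1(t)
        return s2(t) if r is None else r
    return s

def identity(t):
    return t

def fix(body):
    """Recursion: body maps the recursion variable to a strategy."""
    def s(t):
        return body(s)(t)
    return s

def repeat(s):
    """Apply s zero or more times, as often as possible."""
    return fix(lambda x: lchoice(seq(s, x), identity))

def repeat1(s):
    """Apply s one or more times, as often as possible."""
    return seq(s, repeat(s))

term = ("Rev", ("Cons", 1, ("Cons", 2, NIL)), NIL)
reversed_list = seq(repeat(rev2), rev1)(term)
```

Because `repeat` ends in the identity strategy it can never fail, whereas `repeat1` fails when its argument does not apply even once.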

Backtracking As we remarked before, the non-deterministic
choice operator s1 + s2 leads to more than one transformation
when both strategies s1 and s2 are applicable to a term. This
leads to the possibility of backtracking. Consider the strategy
(s1 + s2) · s3. If both s1 and s2 apply to a term t, say we
have t --s1--> t' and t --s2--> t'', but s3 fails for t' and
succeeds for t'', i.e. t' --s3--> ↑ and t'' --s3--> t''', then
we get as result t --(s1 + s2) · s3--> t'''. In other words,
regardless of the order in which s1 and s2 are tried, a
succeeding one will be chosen. Or, in more operational terms,
if a choice made at some point leads to failure later on, the
strategy will backtrack to the choicepoint. This does not hold
for left choice. Once the left branch has succeeded the right
branch can never be chosen. Therefore, left choice provides the
means to define deterministic strategies without the global
backtracking behaviour described above.

Table 3: Operational semantics for term traversal operators.
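
The difference between the two choice operators can be observed on a three-rule toy system. This Python sketch assumes, purely for illustration, that a strategy returns a generator of results (empty on failure) and that the constants A, B, C and D are encoded as strings. The rules `s1`, `s2` and `s3` are hypothetical and do not occur in the paper.

```python
def rule(lhs, rhs):
    """A rule on constants: rewrites exactly the term lhs to rhs."""
    def s(t):
        if t == lhs:
            yield rhs
    return s

def seq(s1, s2):
    """Sequential composition over all results of s1."""
    def s(t):
        for r in s1(t):
            yield from s2(r)
    return s

def choice(s1, s2):
    """Non-deterministic choice: both branches remain available."""
    def s(t):
        yield from s1(t)
        yield from s2(t)
    return s

def lchoice(s1, s2):
    """Left choice: s2 is considered only if s1 has no result."""
    def s(t):
        found = False
        for r in s1(t):
            found = True
            yield r
        if not found:
            yield from s2(t)
    return s

s1 = rule("A", "B")   # succeeds on A, but s3 cannot continue from B
s2 = rule("A", "C")   # succeeds on A, and s3 can continue from C
s3 = rule("C", "D")

with_backtracking = list(seq(choice(s1, s2), s3)("A"))
without_backtracking = list(seq(lchoice(s1, s2), s3)("A"))
```

With non-deterministic choice the failure of `s3` on B causes a return to the choice point, and the path through `s2` is found. With left choice the commitment to `s1` is final, so the composed strategy fails.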

2.4  Term  Traversal 

The operators we introduced above apply strategies to the root
of a term. This is not adequate for achieving all
transformations. For instance, for evaluating an application of
Conc with rules Cnc1 and Cnc2, we need to apply rules to
subterms of the root. For example, if we continue the reduction
of the concatenation we started before we get the reduction
path:

    Conc(Cons(1, Nil), Cons(2, Nil))
      --Cnc2-->    Cons(1, Conc(Nil, Cons(2, Nil)))
      --2(Cnc1)--> Cons(1, Cons(2, Nil))

The second step in this reduction is an application of rule
Cnc1 to the second argument of the Cons.

In order to apply rewrite rules below the root of a term, i.e.
to the subterms of a term, we need operators to traverse the
tree structure of a term. For this purpose we introduce four
new operators. The operational semantics of these operators is
defined in Table 3.

The fundamental operation for term traversal is the application
of a strategy to a specific direct subterm of a term. The
strategy i(s) applies strategy s to the i-th child. Using this
operator an arbitrary path in a term can be constructed. We saw
an example above: 2(Cnc1) applies rule Cnc1 to the second
argument of the root.

The congruence operator f(s1, ..., sn) is a strategy that
specifies a strategy to be applied to each direct subterm of a
term with constructor f. Instead of 2(Cnc1) we could use the
congruence operator Cons(ε, Cnc1) to apply rule Cnc1 to the
second argument of a Cons term. Using this idea we can now
construct a strategy for evaluating the concatenation of two
lists:

    μx.(Cnc1 + Cnc2 · Cons(ε, x))

The strategy repeatedly applies rule Cnc2 and then terminates
with rule Cnc1. After each application of Cnc2 the strategy is
recursively applied to the Conc in the second argument of Cons.

A more general example of the use of congruence operators is
the strategy map(s) that applies a strategy s to each element
of a list:

    map(s) = μx.(Nil + Cons(s, x))
The path and congruence operators are useful for constructing
strategies for a specific data structure. To construct more
general strategies that can abstract from a concrete
representation we introduce the operators □(s) and ◇(s).

The strategy □(s) applies s to each direct subterm of the root.
This only succeeds if s succeeds for each direct subterm. In
case of constants, i.e. constructors without arguments, the
strategy always succeeds, since there are no direct subterms.
This allows us to define very general traversal strategies. For
example, the following strategies apply a strategy s to each
node in a term, in preorder (top-down), postorder (bottom-up)
and a combination of pre- and postorder (downup):

    topdown(s)  = μx.(s · □(x))
    bottomup(s) = μx.(□(x) · s)
    downup(s)   = μx.(s · □(x) · s)

The strategy ◇(s) applies s non-deterministically to one direct
subterm. It fails if there is no subterm for which it succeeds.
In particular, it fails for constants, since they have no child
for which s can succeed. As we did with □ we can construct
bottom-up and top-down traversals with ◇:

    oncetd(s) = μx.(s <+ ◇(x))
    oncebu(s) = μx.(◇(x) <+ s)

These strategies succeed if they find an s-redex as subterm.

These strategies perform a fixed traversal over a term. A
normalization strategy for a strategy s keeps traversing the
term until it finds no more s-redexes. Examples of well-known
normalization strategies are reduce, which repeatedly finds a
redex somewhere in the term, outermost, which repeatedly finds
a redex starting from the root of the term, and innermost,
which looks for redexes from the leaves of the term. Their
definitions are:

    reduce(s)    = repeat(μx.(◇(x) + s))
    outermost(s) = repeat(oncetd(s))
    innermost(s) = repeat(oncebu(s))

Note that this definition of innermost reduction is not very
efficient. After finding a redex, search for the next redex
starts at the root again. A more efficient definition of
innermost reduction is the following.

    innermost'(s) = μx.(□(x) · (s · x <+ ε))

It first normalizes all subterms with □(x), so that afterwards
all subterms are in s-normal form. Then it tries to apply s at
the root. If that fails, the term is in s-normal form and
normalization terminates with ε. Otherwise, the reduct
resulting from applying s is normalized again.
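
The traversal operators and the strategies built from them can be sketched in a few lines of Python. The sketch assumes a deterministic model (a strategy returns a term or `None`), encodes terms as tuples headed by the constructor name, and lets the one-child operator pick the leftmost child for which its argument succeeds. That last choice is our simplification of the non-deterministic operator; the names `all_` and `one` are also ours.

```python
NIL = ("Nil",)

def seq(s1, s2):
    def s(t):
        r = s1(t)
        return None if r is None else s2(r)
    return s

def lchoice(s1, s2):
    def s(t):
        r = s1(t)
        return s2(t) if r is None else r
    return s

def identity(t):
    return t

def fix(body):
    def s(t):
        return body(s)(t)
    return s

def all_(s):
    """Apply s to every direct subterm; fail if any application fails."""
    def st(t):
        if not isinstance(t, tuple):
            return t                      # constants have no subterms
        kids = []
        for k in t[1:]:
            r = s(k)
            if r is None:
                return None
            kids.append(r)
        return (t[0], *kids)
    return st

def one(s):
    """Apply s to one direct subterm (here: the leftmost that succeeds)."""
    def st(t):
        if not isinstance(t, tuple):
            return None                   # constants have no subterms
        for i in range(1, len(t)):
            r = s(t[i])
            if r is not None:
                return t[:i] + (r,) + t[i + 1:]
        return None
    return st

def bottomup(s):
    return fix(lambda x: seq(all_(x), s))

def oncetd(s):
    return fix(lambda x: lchoice(s, one(x)))

def innermost(s):
    """The more efficient innermost normalization from the text."""
    return fix(lambda x: seq(all_(x), lchoice(seq(s, x), identity)))

def cnc1(t):
    """Cnc1: Conc(Nil, xs) -> xs"""
    if isinstance(t, tuple) and t[0] == "Conc" and t[1] == NIL:
        return t[2]
    return None

def cnc2(t):
    """Cnc2: Conc(Cons(x, xs), ys) -> Cons(x, Conc(xs, ys))"""
    if (isinstance(t, tuple) and t[0] == "Conc"
            and isinstance(t[1], tuple) and t[1][0] == "Cons"):
        _, (_, x, xs), ys = t
        return ("Cons", x, ("Conc", xs, ys))
    return None

term = ("Conc", ("Cons", 1, NIL), ("Cons", 2, NIL))
normal_form = innermost(lchoice(cnc1, cnc2))(term)
```

The same helpers give the other traversals, for example `oncetd(cnc1)` rewrites the first `Cnc1`-redex found from the root downwards and fails if there is none.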

Finally, ◈(s) is a parallel (greedy) version of ◇(s) that is
defined by

    ◈(s) = test(◇(s)) · □(s <+ ε)

The operator is a hybrid between □(s) and ◇(s). It is like ◇
because it has to succeed for at least one child and it is like
□ because it applies to all children. The difference with □ is
that it does not have to succeed for all children. An
application of ◈ is the strategy

    somebu(s) = μx.((◈(x) · (s <+ ε)) <+ s)

that applies s bottom-up at least once somewhere in the term,
but as often as possible.

3  Case  Study:  RML  Optimizer 

RML [21] is a strict functional language, essentially similar
to the core of Standard ML [18] with a few restrictions. In
this paper we consider a subset of RML that includes basic
features of functional languages, namely basic constants
(integer, boolean, etc.) and primitive built-in functions,
tuples and selection, let-bindings and mutually recursive
functions. Programs are pre-processed by the compiler of RML to
A-normal form. The syntax of this restriction of RML is
presented in Table 4.

Table 5 describes a set of meaning-preserving source-to-source
transformation rules for RML. For in-depth discussions of the
intent and correctness of these rules we refer the reader to
the literature on transformation of functional programs, e.g.
[3, 4, 12, 20]. The rules in Table 5 were inspired by the
high-level rules presented in [4]. In the sequel, we
concentrate on the details of the implementation of these
rules.

It might seem straightforward to implement these rules by a
rewriting system using the strategy combinators introduced in
the previous section. Unfortunately, this is not the case!
There is a gap between these transformation rules and the
simple rewrite rules defined above. Only (Hoist1) and (Hoist2)
conform to the format. All the other rules use features that
are not provided by basic rewrite systems.


    t    ::= b | t -> t | t1 * ... * tn      (Types)
    se   ::= x | c                           (Simple expressions)
    fdec ::= f : t  x1, ..., xn = e          (Function declarations)
    vdec ::= x : t = e                       (Variable bindings)
    e    ::= se                              (Expressions)
           | x(se1, ..., sen)
           | d(se1, ..., sen)
           | (se1, ..., sen)
           | select(i, se)
           | let vdec in e
           | letrec fdec1 ... fdecn in e

where x, f, f1, ... range over variables, c over constants, and
d over primitive built-in functions, i over integers,
e, e1, ... over expressions, b over basic types, and
t, t1, ... over types.


Table 4: Syntax of RML


(Dead1) and (Dead2) are conditional rewrite rules that remove
pieces of dead code. The condition of (Dead1) tests whether the
variable defined by the let occurs in the body of the let. The
condition of (Dead2) tests whether any of the functions defined
in the list of function declarations occurs in the body. (Prop)
and (Inline) require substitution of free occurrences of a
variable by an expression. (Inline) uses simultaneous
substitution of a list of expressions for a list of variables.
In addition, it is a context-sensitive rule, replacing an
application of the function f somewhere in the expression e by
the body of the function. This is expressed by the use of a
context e[f(es)]. Furthermore, the rule renames all occurrences
of bound variables with fresh variables, to preserve the
invariant that all bound variables are distinct. This invariant
simplifies substitution and testing for variable occurrence in
an expression. Finally, (Etaexp) generates fresh variables,
which is a global condition on the whole term.

4  Refining  the  Strategy  Language 

The RML example shows that simple unconditional rules lack the
expressivity to describe optimization rules for programming
languages and that we need enriched rewrite rules with features
such as side conditions and contexts and support for alpha
renaming and substitution of object variables. For other
applications we might need other features such as list matching
and matching modulo associativity and commutativity. Adding
each of these features as an ad-hoc extension of basic rewrite
rules would make the language difficult to implement and
maintain. It would be desirable to find a more uniform method
to deal with such extensions.

If we take a closer look at the features discussed above, we
observe that they all have strategy-like behaviour. For
instance, a rule with a context c[l'] in the left-hand side and
c[r'] in the right-hand side can be seen as performing a
traversal over the subterm matching c applying rule l' → r'.
Therefore, instead of creating more complex primitives such as
rules with contexts, we break down rewrite rules into their
primitives: matching against term patterns and building terms.
Using these primitives we can implement a wide range of
features in the strategy language itself by translating rules
which use those features to strategy expressions.


    (Hoist1)  let v : t = let vdec in e1 in e2
                → let vdec in let v : t = e1 in e2

    (Hoist2)  let v : t = letrec fdecs in e1 in e2
                → letrec fdecs in let v : t = e1 in e2

    (Dead1)   let v : t = e1 in e2
                → e2
                if v ∉ vars(e2) and e1 is pure

    (Dead2)   letrec fdecs in e
                → e
                if for all f : t xs = e' in fdecs: f ∉ fv(e)

    (Prop)    let v : t = se in e
                → let v : t = se in e{se/v}

    (Inline)  letrec f : t xs = e' in e[f(es)]
                → letrec f : t xs = e' in e[rename(e'{es/xs})]
                if (f ∉ es ∪ fv(e[])) or (e' is small)

    (Select)  let x : t = (se1, ..., sen) in e[select(i, x)]
                → let x : t = (se1, ..., sen) in e[sei]

    (Etaexp)  let f : ts → t = e1 in e2
                → letrec f : ts → t xs =
                    let f' : ts → t = e1 in f'(xs) in e2
                if f' and xs are fresh variables and e1 is pure


Table 5: Transformation rules for RML


Match, Build and Scope We first need to define the semantics of
matching and building terms. A rewrite rule ℓ : l → r first
matches the term against the left-hand side l with as result a
binding of subterms to the variables in l. Subsequently it
builds a new term by instantiating the right-hand side r with
those variable bindings. By introducing the new strategy
primitives match and build we can break down ℓ into a strategy
match(l) · build(r). However, this requires that we carry the
bindings obtained by match over the sequential composition to
build. For this reason, we introduce the notion of environments
explicitly in the semantics.

An environment ℰ is a mapping of variables to ground terms. We
denote the instantiation of a term t by an environment ℰ by
ℰ(t). An environment ℰ' is an extension of environment ℰ
(notation ℰ' ⊒ ℰ) if for each x ∈ dom(ℰ) we have
ℰ'(x) = ℰ(x). An environment ℰ' is the smallest extension of ℰ
with respect to a term t (notation ℰ' ⊒_t ℰ), if ℰ' ⊒ ℰ and if
dom(ℰ') = dom(ℰ) ∪ vars(t).

Now we can formally define the semantics of match and build. We
extend the reduction relation --s--> from a relation between
terms and reducts to a relation on pairs of terms and
environments, i.e. a strategy s transforms a term t and an
environment ℰ into a transformed term t' and an extended
environment ℰ', denoted by t : ℰ --s--> t' : ℰ', or fails,
denoted by t : ℰ --s--> ↑. The operational semantics of the
environment operators is defined in Table 6.

Once a variable is bound it cannot be rebound to a different
term. To use a rule more than once we introduce variable
scopes. A scope {x : s} locally undefines the variables x. The
notation ℰ\x denotes ℰ without bindings for variables in x.
ℰ|x denotes ℰ restricted to x.
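
The interplay of match, build and scope can be tried out with a small Python model. It is a sketch under our own assumptions: a strategy takes a term and an environment (a dictionary from variable names to terms) and returns a pair of the new term and the new environment, or `None` on failure. The names `Var`, `bind`, `pattern_vars` and `rule` are hypothetical helpers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

NIL = ("Nil",)

def pattern_vars(p):
    """The set of variable names occurring in a pattern."""
    if isinstance(p, Var):
        return {p.name}
    if isinstance(p, tuple):
        return set().union(*(pattern_vars(q) for q in p[1:]))
    return set()

def bind(p, t, env):
    """Smallest extension of env under which p equals t, or None."""
    if isinstance(p, Var):
        if p.name in env:
            return env if env[p.name] == t else None   # no rebinding
        return {**env, p.name: t}
    if isinstance(p, tuple):
        if not (isinstance(t, tuple) and len(t) == len(p) and t[0] == p[0]):
            return None
        for q, u in zip(p[1:], t[1:]):
            env = bind(q, u, env)
            if env is None:
                return None
        return env
    return env if p == t else None

def instantiate(p, env):
    if isinstance(p, Var):
        return env[p.name]
    if isinstance(p, tuple):
        return (p[0],) + tuple(instantiate(q, env) for q in p[1:])
    return p

def match(p):
    """Keep the term, extend the environment."""
    def s(t, env):
        env2 = bind(p, t, env)
        return None if env2 is None else (t, env2)
    return s

def build(p):
    """Replace the term by an instance of p; fail on unbound variables."""
    def s(t, env):
        if not pattern_vars(p) <= set(env):
            return None
        return (instantiate(p, env), env)
    return s

def seq(s1, s2):
    def s(t, env):
        r = s1(t, env)
        return None if r is None else s2(*r)
    return s

def scope(names, s):
    """Run s with names undefined, then restore their outer bindings."""
    def st(t, env):
        r = s(t, {k: v for k, v in env.items() if k not in names})
        if r is None:
            return None
        t2, env2 = r
        out = {k: v for k, v in env2.items() if k not in names}
        out.update({k: v for k, v in env.items() if k in names})
        return (t2, out)
    return st

def rule(lhs, rhs):
    """A rewrite rule as a scoped match followed by a build."""
    return scope(pattern_vars(lhs) | pattern_vars(rhs),
                 seq(match(lhs), build(rhs)))

x, xs, ys = Var("x"), Var("xs"), Var("ys")
lhs = ("Rev", ("Cons", x, xs), ys)
rhs = ("Rev", xs, ("Cons", x, ys))
term = ("Rev", ("Cons", 1, ("Cons", 2, NIL)), NIL)

scoped_twice = seq(rule(lhs, rhs), rule(lhs, rhs))(term, {})
unscoped = seq(match(lhs), build(rhs))
unscoped_twice = seq(unscoped, unscoped)(term, {})
```

Without a scope the second application fails, because `x` is still bound to 1 when the pattern is matched against a list whose head is 2. With the scope each application starts with its variables undefined.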

We have changed the format of the operational semantics.
Therefore, we should change all rules in Tables 2 and 3 as
follows: replace each t --s--> t' by t : ℰ --s--> t' : ℰ'.

5  Implementation  of  Transformation  Rules 

We now have a strategy language that consists of match and
build as atomic strategies (instead of rewrite rules) and all
the combinators introduced in Section 2. Using this refined
strategy language, we can implement transformation rules by
translating them to strategy expressions. In this higher-level
view of strategies we can use both the 'low-level' features
match, build and scope and the 'high-level' features such as
contexts and conditions. We start by defining the meaning of
unconditional rewrite rules in terms of our refined strategy
language.

5.1  Unconditional  Rewrite  Rules  Revisited 

A labeled rewrite rule ℓ : l → r translates to a strategy
definition

    ℓ = {vars(l, r) : match(l) · build(r)}

It introduces a local scope for the variables vars(l, r) used
in the rule, matches the term against l and then builds r using
the binding obtained by matching.

5.2  Subcomputation 

Many transformation rules require a subcomputation in order to
achieve the transformation from left-hand side to right-hand
side. For instance, the inlining rule in Table 5 applies a
substitution and a renaming to an expression in the right-hand
side.

Where  The where clause is the basic extension to rewrite
rules to achieve subcomputations. A rule

$$\ell : l \rightarrow r \ \mathbf{where}\ s$$

corresponds to the strategy

$$\ell = \{\mathrm{vars}(l, r, s) : \mathrm{match}(l) \cdot s \cdot \mathrm{build}(r)\}$$
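(a) positive rules

$$t : \mathcal{E} \xrightarrow{\mathrm{match}(t')} t : \mathcal{E}' \quad \text{if } \mathcal{E}' \sqsupseteq \mathcal{E} \wedge \mathcal{E}'(t') = t$$

$$t : \mathcal{E} \xrightarrow{\mathrm{build}(t')} \mathcal{E}(t') : \mathcal{E} \quad \text{if } \mathrm{vars}(t') \subseteq \mathrm{dom}(\mathcal{E})$$

$$\frac{t : \mathcal{E}/x \xrightarrow{s} t' : \mathcal{E}'}{t : \mathcal{E} \xrightarrow{\{x : s\}} t' : \mathcal{E}'/x \cup \mathcal{E}|x}$$

(b) negative rules

$$t : \mathcal{E} \xrightarrow{\mathrm{match}(t')} \uparrow \quad \text{if } \neg\exists \mathcal{E}' \sqsupseteq \mathcal{E} \wedge \mathcal{E}'(t') = t$$

$$t : \mathcal{E} \xrightarrow{\mathrm{build}(t')} \uparrow \quad \text{if } \mathrm{vars}(t') \not\subseteq \mathrm{dom}(\mathcal{E})$$

$$\frac{t : \mathcal{E}/x \xrightarrow{s} \uparrow}{t : \mathcal{E} \xrightarrow{\{x : s\}} \uparrow}$$

Table 6: Operational semantics for environment operators
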


that first matches $l$, then executes $s$ and finally builds $r$.
The strategy $s$ can be any strategy that affects the environment
in order to bind variables used in $r$. Note that $s$ can
transform the original term, but the effect of this is canceled
by the subsequent build. Only the side-effect of $s$ on
the environment matters.

Matching Condition  A frequently occurring subcomputation
is to apply a strategy to a term built with variables
from the left-hand side $l$ and match the result against a
pattern with variables used in the right-hand side $r$. The
notation

$$\langle s \rangle\, t \Rightarrow t'$$

corresponds to the strategy

$$\mathrm{build}(t) \cdot s \cdot \mathrm{match}(t')$$

It first builds $t$ and applies $s$ to it, then matches the result
(if $s$ succeeded) against $t'$, binding the variables in $t'$ as a
side-effect. If the match against $t'$ is not needed, then $\langle s \rangle\, t$
can be used either to get the side-effect of $s$ or only to test
for success of $s$.
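
The where clause and the matching condition can be sketched together in Python. This is an illustrative model under assumptions, not the paper's implementation: a strategy is a function from (term, env) to a new pair or None, terms are tuples headed by a constructor name, pattern variables are strings starting with "?", and the names seq, where_rule and cond are hypothetical.

```python
def is_var(p):
    return isinstance(p, str) and p.startswith("?")

def match(pat):
    """match(pat): extend env so that pat instantiates to the current term."""
    def go(p, t, env):
        if is_var(p):
            if p in env:
                return env if env[p] == t else None
            return {**env, p: t}
        if isinstance(p, tuple):
            if not isinstance(t, tuple) or len(t) != len(p) or t[0] != p[0]:
                return None
            for sp, st in zip(p[1:], t[1:]):
                env = go(sp, st, env)
                if env is None:
                    return None
            return env
        return env if p == t else None
    def run(term, env):
        new_env = go(pat, term, env)
        return None if new_env is None else (term, new_env)
    return run

def build(pat):
    """build(pat): replace the current term by pat instantiated under env."""
    def inst(p, env):
        if is_var(p):
            return env[p]                       # KeyError if unbound
        if isinstance(p, tuple):
            return (p[0], *(inst(k, env) for k in p[1:]))
        return p
    def run(term, env):
        try:
            return inst(pat, env), env
        except KeyError:
            return None
    return run

def seq(*steps):
    """s1 . s2 . ...: thread (term, env) through the steps, failing fast."""
    def run(term, env):
        for s in steps:
            result = s(term, env)
            if result is None:
                return None
            term, env = result
        return term, env
    return run

def cond(s, t, t2):
    """<s> t => t2, read as build(t) . s . match(t2)."""
    return seq(build(t), s, match(t2))

def where_rule(lhs, rhs, s):
    """l -> r where s, read as match(l) . s . build(r) in a fresh scope."""
    body = seq(match(lhs), s, build(rhs))
    def apply(term):
        result = body(term, {})
        return None if result is None else result[0]
    return apply

# Hypothetical example: Pair(x, y) -> Pair(z, y) where <hd> x => z
hd = seq(match(("Cons", "?h", "?t")), build("?h"))
first_of = where_rule(("Pair", "?x", "?y"), ("Pair", "?z", "?y"),
                      cond(hd, "?x", "?z"))
result = first_of(("Pair", ("Cons", "a", ("Nil",)), "b"))
```

While the condition runs, the current term is temporarily the head of the list, but the final build discards it; only the binding of ?z survives, which mirrors the remark that only the side-effect on the environment matters.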

Application in Right-hand Side  Often it is annoying
to introduce an intermediate name for the result of applying
a strategy to a subterm of the right-hand side. Therefore,
the application $\langle s \rangle\, t$ can be used directly in the right-hand
side $r$. That is, a rule

$$\ell : l \rightarrow r\{\langle s \rangle\, t\}$$

is an abbreviation of

$$\ell : l \rightarrow r\{x\} \ \mathbf{where}\ \langle s \rangle\, t \Rightarrow x$$

where $x$ is a new variable and $r\{\langle s \rangle\, t\}$ denotes a
meta-context, i.e. a term with an occurrence of $\langle s \rangle\, t$.

Conditions  Conditions that check whether some predicate
holds can also be seen as subcomputations. We
implement these conditions as strategies using the where
clause. Failure of such a strategy means that the condition
does not hold, while success means that it does hold.
Predicates are user-defined strategy operators. For instance, to
test that $t_1$ is a subterm of $t_2$ the condition $\langle \mathrm{in} \rangle\, (t_1, t_2)$ can
be used. The predicate in is defined as

$$\mathrm{in} = \{t_1, t_2 : \mathrm{match}((t_1, t_2)) \cdot \langle \mathrm{oncetd}(\mathrm{match}(t_1)) \rangle\, t_2\}$$


Conditions can be combined by means of the strategy
combinators. In particular, conjunction of conditions is
expressed by means of sequential composition and disjunction
by means of non-deterministic choice.

5.3  Contexts 

A useful class of rules are those whose left-hand sides do not
match a fixed pattern but match a top pattern and some inner
patterns which occur in contexts. For instance, consider
the (Inline) and (Select) rules in Table 5. Contexts can also
be implemented with the where clause. A rule

$$\ell : l\{c[l']\} \rightarrow r\{c[r']\}$$

with one context $c[\,]$ occurring in the left-hand side and
right-hand side corresponds to the rule

$$\ell : l\{x\} \rightarrow r\{x'\} \ \mathbf{where}\ \langle \mathrm{oncetd}(\{\mathrm{vars}(l', r') \setminus \mathrm{vars}(l, r) : (l' \rightarrow r')\}) \rangle\, x \Rightarrow x'$$

where $x$ and $x'$ are fresh variables. The notation $(l' \rightarrow r')$
is an abbreviation for $\mathrm{match}(l') \cdot \mathrm{build}(r')$ and is used to
inline a rule in a strategy. The strategy in the where clause
traverses the subterm bound to $x$ (using oncetd) to find
one occurrence of $l'$ and replaces it with $r'$. The result of
the traversal is bound to $x'$, which is then used in the
right-hand side of the rule. Note that we locally scope the
variables of $l'$ and $r'$ except those shared with $l$ and $r$,
since the latter are bound by the matching of $l$.

The implementation of the rule above replaces exactly
one occurrence of $l'$ in the redex due to the strategy oncetd.
To replace all occurrences of $l'$ in the context, we have
defined a greedy context, written $c[|l'|]$. The implementation of
this context is the same as that of the contexts above, except that
the traversal strategy sometd is used instead of oncetd.
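
The difference between an ordinary and a greedy context can be illustrated with a small Python sketch. It rests on assumptions: terms are tuples headed by a constructor name, a strategy maps a term to a new term or None, and greedy_td is a simplified greedy traversal, rec x . (s <+ some(x)), standing in for the library strategy sometd rather than reproducing its exact definition.

```python
def one(s):
    """Apply s to the leftmost child for which it succeeds; fail if none."""
    def run(t):
        if not isinstance(t, tuple):
            return None
        for i in range(1, len(t)):
            r = s(t[i])
            if r is not None:
                return t[:i] + (r,) + t[i + 1:]
        return None
    return run

def some(s):
    """Apply s to every child for which it succeeds; fail if it succeeds nowhere."""
    def run(t):
        if not isinstance(t, tuple):
            return None
        results = [s(k) for k in t[1:]]
        if all(r is None for r in results):
            return None
        return (t[0], *(k if r is None else r for k, r in zip(t[1:], results)))
    return run

def oncetd(s):
    """rec x . (s <+ one(x)): rewrite the first match found top-down."""
    def x(t):
        r = s(t)
        return r if r is not None else one(x)(t)
    return x

def greedy_td(s):
    """Simplified greedy traversal, rec x . (s <+ some(x))."""
    def x(t):
        r = s(t)
        return r if r is not None else some(x)(t)
    return x

# Inner rule l' -> r' of a hypothetical propagation: Var(v) -> Const(1)
def prop(t):
    return ("Const", "1") if t == ("Var", "v") else None

body = ("App", ("Var", "v"), ("Var", "v"))
once = oncetd(prop)(body)       # ordinary context: one occurrence replaced
greedy = greedy_td(prop)(body)  # greedy context: all occurrences replaced
```

Both traversals fail when the inner pattern does not occur at all, so the enclosing rule fails as well.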

5.4  Alpha  Renaming 

An important feature of program manipulation is bound
variable renaming. A major requirement is to provide
renaming as an object language independent operation. This
means that the designer should indicate the binding
constructs of the language. This is done by mapping each
binding construct to the list of variables that it binds. For
example, for the Let construct, the rule

Bind1 : Let(Vdec(t, v, e), e') -> [v];

gives the binding variable v (see Appendix A for the other
rules). Given these rules the strategy rename renames all
bound object variables in the term to which it is applied. It
is defined using the strategy language (see the definition in
Appendix D). This strategy uses the built-in strategy new,
which generates fresh names.

6  Rules  and  Strategy  for  RML 

Rules  Table 7 presents the specification of RML optimization.
It consists of a signature, rewrite rules and strategy
definitions. The signature allows us to statically check the
rules, strategies and input term. Because the input of the
optimizer is a program in abstract syntax, we use the
abstract syntax of RML programs instead of concrete syntax.

One benefit of rewrite strategies is that the specification
of RML is very close to the high-level rules presented
in Table 5. There are very few changes, namely the use of
greedy contexts for efficiency, and the bind
rules in Appendix A.

Strategies  Another important advantage of our approach
is the ability to experiment with and reason about strategies,
which are generally heuristic. We present two possible
strategies: optimize1 and optimize2. No change
of rules is required. Separating strategies from rules
prevents many mistakes and enables us to reason about
properties such as termination. For instance, in optimize1
and optimize2 we avoid applying EtaExp repeatedly,
since this rule is not terminating. Both optimize1
and optimize2 first apply EtaExp once everywhere in the
term. The strategy optimize1 uses the generic strategies
innermost' and somedownup (see Appendix B) to apply the
rest of the rules. The strategy somedownup is a variant of
sometd that applies a strategy s at all positions of a term.
It fails when none of these applications succeeds. If it
succeeds we know that some redex has been reduced. Hence,
we can repeat somedownup to normalize a term.

While optimize1 uses generic strategies, optimize2 performs
specific analyses to apply rules. It first tries to hoist a
Let at the root. Notice that it repeats Hoist1 since it may
reapply at the root, whereas Hoist2 cannot reapply after
one application. Then, only Let or Letrec expressions can
be redexes. For each case there are specific rules that can
apply. This leads us to define a sub-strategy for each case
and compose them non-deterministically. In both cases we
first normalize the body of the Let or Letrec expression.
For a Let we try the rules Prop and Sel and then Dead1.
For a Letrec, we first normalize the bodies of the functions
of the Letrec expression. Then we try Inl1 or Inl2 and if
they succeed we try Dead2. Since inlining gives rise to new
opportunities for optimization, we reapply the strategy to this
term.

7  Implementation 

The strategy language presented in this paper has been
implemented in SML. The programming environment consists
of a simple interactive shell that can be used to load
specifications and terms, to apply strategies to terms using an
interpreter and to inspect the result. A simple inclusion
mechanism is provided for modularizing specifications. The
current implementation does not yet implement the sort
checking for rules and strategies. In addition to an
interpreter the environment contains a compiler. It compiles a
strategy to a C program that transforms terms according to
the strategy.

The compilation of non-deterministic strategies is reminiscent
of the implementation of Prolog in the WAM [1], using
success and failure continuations and a stack of choicepoints
to implement full backtracking. A difference with the WAM is
that our implementation deals with choicepoints occurring
inside a traversal, as in the strategy $\Box(s_1 + s_2) \cdot s_3$.
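
The backtracking behaviour of non-deterministic choice can be sketched in Python. Note the substitution: Python generators are used here in place of explicit success and failure continuations and a choicepoint stack, so this only models the observable behaviour, not the compiled C code. The names lift, choice, seq and first are hypothetical.

```python
def lift(f):
    """Turn a partial function (None on failure) into a backtracking strategy."""
    def run(t):
        r = f(t)
        if r is not None:
            yield r
    return run

def choice(s1, s2):
    """s1 + s2: offer the results of s1, then those of s2 (a choicepoint)."""
    def run(t):
        yield from s1(t)
        yield from s2(t)
    return run

def seq(s1, s2):
    """s1 . s2: if s2 fails on a result of s1, resume the alternatives of s1."""
    def run(t):
        for mid in s1(t):
            yield from s2(mid)
    return run

def first(s, t):
    """Return the first successful result of s on t, or None."""
    return next(s(t), None)

# Toy strategies on constant terms: s1 : A -> B, s2 : A -> C, s3 accepts only C.
s1 = lift(lambda t: "B" if t == "A" else None)
s2 = lift(lambda t: "C" if t == "A" else None)
s3 = lift(lambda t: t if t == "C" else None)

answer = first(seq(choice(s1, s2), s3), "A")
```

Here s1 succeeds first, but s3 rejects its result, so evaluation falls back to the choicepoint and tries s2, whose result s3 accepts.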

The  run-time  environment  of  compiled  strategies  is  based 
on  the  ATerm  C-library  [19].  It  provides  functionality  for 
building  and  manipulating  a  term  data-structure,  reference 
count  garbage  collection,  a  parser  and  pretty-printer  for 
terms.  An  important  feature  is  that  full  sharing  of  terms 
is  maintained  (hash-consing)  to  reduce  memory  usage. 
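
The idea behind full sharing can be sketched in a few lines of Python. This is an assumption-laden illustration, not the ATerm library's API: a dict plays the role of the hash table, terms are tuples, and the class name TermTable and its make method are hypothetical.

```python
class TermTable:
    """Maximal sharing: structurally equal terms are represented by one object."""

    def __init__(self):
        self._table = {}

    def make(self, cons, *kids):
        """Return the unique shared node for cons(kids...)."""
        key = (cons, *kids)
        # setdefault stores key as the canonical node on first sight and
        # returns the previously stored object on every later request.
        return self._table.setdefault(key, key)

    def __len__(self):
        return len(self._table)

terms = TermTable()
a = terms.make("Var", "v")
b = terms.make("Var", "v")          # same object as a
c = terms.make("App", a, b)
d = terms.make("App", b, a)         # same object as c
```

With sharing in place, equality of terms built through the table reduces to a pointer comparison (a is b), and repeated subterms are stored once.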

We have used the implementation to experiment with
the optimizer for RML discussed in this paper, but more
work is needed before we can present performance results.
The strategy language provides many opportunities for
optimization. We plan to apply our technique to optimizing
strategies.

8  Related  Work 

Program Optimization  There have been many attempts
to build frameworks for program analysis and optimization,
often using special-purpose formalisms. Systems
close to ours in spirit include TXL [9, 17], Puma [13],
OPTIMIX [5], and KHEPERA [11]. All these systems provide
tree transformation languages with succinct primitives
for matching subtrees. Most of these languages require
tree traversal to be programmed explicitly. TXL includes
a “searching” version of the match operator which behaves
like an application of our topdown strategy. KHEPERA
provides a built-in construct to iterate over the immediate
children of a node.

Other recently-proposed optimization frameworks tend
to rely on general-purpose languages to describe transformations.
Aspect-Oriented Programming [14] advocates the use
of domain-specific “aspect” languages to describe optimization
of program IR trees; however, existing examples appear
to use LISP for this purpose. Intentional Programming [2]
provides a library of routines for manipulating ASTs; in
principle, these routines can be invoked from a variety of
(intentional representations of) languages, but the current
implementation uses C-style programs.

Strategies  First-order algebraic specification formalisms
such as ASF+SDF [10] provide a fixed strategy for normalizing
terms with respect to a set of rewrite rules. A common
work-around to implement strategies in such a setting is to
encode a strategy into the rewrite system by providing an
extra outermost constructor that determines at which point
in the term a rewrite rule can be applied.

Originating in theorem proving tactics, rewriting strategies
were introduced in the algebraic specification languages
ELAN [7] and Maude [8]. Maude is a specification formalism
based on rewriting logic. It provides equations that are
interpreted with innermost rewriting and labeled rules that
are used with an outermost strategy. Strategies for applying
labeled rules can be defined in Maude itself by means of
reflection.

ELAN  provides  a  built-in  strategy  language  similar  to 
the  one  in  this  paper.  The  strategy  language  described  in 
this  paper  is  a  generalization  of  the  language  of  ELAN.  The 

imports lib list subs props
signature
  sorts TExp Vdec Fdec Se Exp
  operations
    Funtype    : List(TExp) * TExp                  -> TExp  -- Type expressions
    Recordtype : List(TExp)                         -> TExp
    Primtype   : String                             -> TExp
    Vdec       : TExp * String * Exp                -> Vdec  -- Variable declarations
    Fdec       : TExp * String * List(String) * Exp -> Fdec  -- Function declarations
    Const      : TExp * String                      -> Se    -- Simple expressions
    Var        : String                             -> Se
    Simple     : Se                                 -> Exp   -- Expressions
    Record     : List(Se)                           -> Exp
    Select     : Int * Se                           -> Exp
    Papp       : String * List(Se)                  -> Exp
    App        : Se * List(Se)                      -> Exp
    Let        : Vdec * Exp                         -> Exp
    Letrec     : List(Fdec) * Exp                   -> Exp
rules
  Hoist1 : Let(Vdec(t, v, Let(vdec, e1)), e2) -> Let(vdec, Let(Vdec(t, v, e1), e2));
  Hoist2 : Let(Vdec(t, v, Letrec(fdecs, e1)), e2) -> Letrec(fdecs, Let(Vdec(t, v, e1), e2));
  Prop   : Let(Vdec(t, v, Simple(s)), e[| Var(v) |]) -> Let(Vdec(t, v, Simple(s)), e[| s |]);
  Dead1  : Let(Vdec(t, v, e1), e2) -> e2 where <not(in)> (v, e2) . <pure> e1;
  Dead2  : Letrec(fdecs, e1) -> e1
           where <map({f : match(Fdec(_, f, _, _)) . <not(in)> (f, e1)})> fdecs;
  Inl1   : Letrec([Fdec(t, f, xs, e1)], e2[| App(Var(f), ss) |]) ->
           Letrec([Fdec(t, f, xs, e1)], e2[| <subs . rename> (xs, ss, e1) |])
           where <small> e1;
  Inl2   : Letrec([Fdec(t, f, xs, e1)], e2[ App(Var(f), ss) ]) ->
           Letrec([Fdec(t, f, xs, e1)], e2[ <subs . rename> (xs, ss, e1) ])
           where <not(in)> (Var(f), e1) . <not(in)> (Var(f), e2[Hole]);
  Sel    : Let(Vdec(t, v, Record(ss)), e[| Select(i, Simple(Var(v))) |]) ->
           Let(Vdec(t, v, Record(ss)), e[| <index> (i, ss) |]);
  EtaExp : Let(Vdec(Funtype(ts, t), f1, e1), e2) ->
           Letrec([Fdec(Funtype(ts, t), f1, xs,
                        Let(Vdec(Funtype(ts, t), f2, e1), App(Var(f2), xs)))], e2)
           where <pure> e1 . <new> f1 => f2 . <map(new . {x: <x -> Var(x)>})> ts => xs
strategies
  group1 = Inl1 + Inl2 + Sel + Prop
  group2 = (Dead1 + Dead2) <+ (Hoist1 + Hoist2)

  opt1 = innermost'(Hoist1 + Hoist2) .
         somedownup((group1 . repeat(Dead1 + Dead2) <+ repeat1(Dead1 + Dead2)))
  optimize1 = bottomup(try(EtaExp)) . repeat(opt1)

  opt2 = rec x . (repeat(Hoist1) . try(Hoist2) .
           try( Let(id, x) . try(Prop + Sel) . try(Dead1)
              + Letrec(id, x) . (Dead2 <+ try(Letrec(map(Fdec(id, id, id, x)), id) .
                                             try((Inl1 + Inl2) . try(Dead2) . x)))))
  optimize2 = bottomup(try(EtaExp)) . opt2

Table 7: Specification of RML transformation rules

resulting language is a combination of ideas from the process
algebra ACP [6] and the modal mu-calculus [15]. An earlier
version of our language was described in [16]. Technical
contributions of our strategy language include the modal
operators □, ◇ and ❖ that enable very concise specification
of term traversal; the explicit recursion operator μx. s; the
refinement of rewrite rules into match and build; and the
encoding of complex rewriting features into strategies, in
particular the expression of rules with contexts.

9  Conclusions 

We have illustrated how separating transformation rules
from the application strategy can promote concise,
understandable descriptions of complex rewriting tasks. Our
example compiler optimizer takes about 50 lines; the
corresponding handwritten Standard ML code is several hundred
lines. Moreover, we can completely alter the optimizer’s
rewriting strategy by changing just two or three lines;
similar changes to the ML version would require extensive
structural edits throughout the code.

Although we concentrate on program optimizers in this
paper, we believe that the techniques are equally
applicable in other areas where source-to-source transformations
are used, including simplification, typechecking,
interpretation and software renovation.
sions that started our work on strategies. Several ideas that
erably by the use of Tim Sheard’s programs for generation

A  User-defined RML Predicates
(* file: props.r
   - Some properties of RML expressions *)

rules

  Bind1 : Let(Vdec(t, v, e), e') -> [v];
  Bind2 : Letrec(fdecs, e) ->
          <map({f: <Fdec(_, f, _, _) -> f>})> fdecs;
  Bind3 : Fdec(_, _, xs, _) -> xs

strategies

  rmlrename = rename(Bind1 + Bind2 + Bind3)

  small = Simple(id) + Record(id) + Select(id, id) +
          Papp(id, id) + App(id, id)

  pure = not(oncebu(match(Papp("assign", _))))

B  Generic  Strategies 

In this and the following appendices we present three sets of
generally applicable strategy operators. Note that all, one,
and some stand for □, ◇, and ❖.

(* file: lib.r - Standard strategies *)

strategies

(* Try s *)

try(s) = s <+ id

(* Repetition *)

repeat(s) = rec x . ((s . x) <+ id)
repeat1(s) = s . repeat(s)

(* Traversal; all s applications have
   to succeed *)

bottomup(s) = rec x . (all(x) . s)
topdown(s) = rec x . (s . all(x))
downup(s) = rec x . (s . all(x) . s)
downup2(s1, s2) = rec x . (s1 . all(x) . s2)

(* Traversal; one s application
   has to succeed *)

oncebu(s) = rec x . (one(x) <+ s)
oncetd(s) = rec x . (s <+ one(x))

(* Greedy traversal; apply s as often as possible
   and at least once. *)

somebu(s) = rec x . (some(x) <+ s)
sometd(s) = rec x . (s . all(s <+ id) <+ some(x))

(* Greedier *)

somedownup(s) = rec x . ((s . (all(x) . (s <+ id)
                <+ id)) <+ (some(x) . (s <+ id)))

(* Normalization strategies *)

reduce(s) = repeat(rec x . (some(x) + s))
outermost(s) = repeat(oncetd(s))
innermost(s) = repeat(oncebu(s))
innermost'(s) = rec x . (all(x) . (s . x <+ id))
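
To make the behaviour of some of these library definitions concrete, here is a deterministic Python transliteration. It is a sketch under assumptions: terms are tuples headed by a constructor name, a strategy maps a term to a term or None, non-deterministic choice is not modelled, and the trailing underscores in try_ and all_ only avoid Python keywords and builtins.

```python
def ident(t):
    return t

def seq(s1, s2):
    """s1 . s2"""
    def run(t):
        r = s1(t)
        return None if r is None else s2(r)
    return run

def left_choice(s1, s2):
    """s1 <+ s2: s2 is tried only if s1 fails."""
    def run(t):
        r = s1(t)
        return r if r is not None else s2(t)
    return run

def try_(s):
    return left_choice(s, ident)

def repeat(s):
    """rec x . ((s . x) <+ id), unrolled into a loop."""
    def run(t):
        while True:
            r = s(t)
            if r is None:
                return t
            t = r
    return run

def all_(s):
    """Apply s to every child; fail if any application fails."""
    def run(t):
        if not isinstance(t, tuple):
            return t
        kids = [s(k) for k in t[1:]]
        return None if any(k is None for k in kids) else (t[0], *kids)
    return run

def one(s):
    """Apply s to the leftmost child for which it succeeds."""
    def run(t):
        if not isinstance(t, tuple):
            return None
        for i in range(1, len(t)):
            r = s(t[i])
            if r is not None:
                return t[:i] + (r,) + t[i + 1:]
        return None
    return run

def bottomup(s):
    def x(t):
        return seq(all_(x), s)(t)
    return x

def oncebu(s):
    def x(t):
        return left_choice(one(x), s)(t)
    return x

def innermost(s):
    return repeat(oncebu(s))

# Hypothetical rule: Plus(Zero, e) -> e
def plus_zero(t):
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "Plus" and t[1] == ("Zero",):
        return t[2]
    return None

term = ("Plus", ("Zero",), ("Plus", ("Zero",), ("A",)))
by_bottomup = bottomup(try_(plus_zero))(term)
by_innermost = innermost(plus_zero)(term)
```

On this input both traversals reach the same normal form; they differ in how often the term is traversed, since innermost restarts from the root after each single reduction.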

C  Lists  and  Pairs 

Lists are constructed with the polymorphic constructors
Cons and Nil. Finite lists can be constructed
with the special notation $[t_1, \ldots, t_n]$, abbreviating
$\mathrm{Cons}(t_1, \ldots \mathrm{Cons}(t_n, \mathrm{Nil}))$. Lists have type List(A)
with A some type. Tuples $(t_1, \ldots, t_n)$ have type
$\mathrm{Prod}([A_1, \ldots, A_n])$, where $A_i$ is the type of $t_i$.

(* file: list.r *)
signature
  operations
    Zip : Prod([List(A), List(B)]) -> List(Prod([A, B]))

rules

  Hd   : Cons(x, l) -> x;
  Tl   : Cons(x, l) -> l;
  Fst  : (x, y) -> x;
  Snd  : (x, y) -> y;
  Zip1 : Zip(Nil, Nil) -> Nil;
  Zip2 : Zip(Cons(x, xs), Cons(y, ys)) ->
         Cons((x, y), Zip(xs, ys));
  Ind1 : (1, Cons(x, xs)) -> x;
  Ind2 : (n, Cons(x, xs)) -> (n-1, xs) where geq(n,2)

strategies

(* Evaluation strategies *)

zip(s) = rec x . (Zip1 + Zip2 . Cons(s, x))
index = repeat(Ind2) . Ind1

(* Concatenation *)

conc = {l: match((l, _)) . Snd .
        rec x . (Cons(id, x) <+ build(l))}

(* Find first list element for which s succeeds *)

fetch(s) = rec x . (Cons(s, id) <+ Cons(id, x))

(* Apply strategy to each element of a list *)

map(s) = rec x . (Nil + Cons(s, x))

D  Substitution  and  Renaming 

(* file: subs.r *)

strategies

(* Test occurrence of a in b *)

in = {a: match((a, _)) . Snd . oncebu(match(a))}

(* Substitution *)

subs = {lst, xs, ss, t:
        match((xs, ss, t)) .
        <zip(id)> Zip(xs, ss) => lst .
        <topdown(<Var(x) -> z
                  where <fetch(match((x, z)))> lst> <+ id)> t}

(* Renaming *)
rules

  Init  : t -> (t, []);
  Fresh : x -> (x, <new> x);
  Ren   : (x, l) -> z where <fetch(match((x, z)))> l

strategies

binds(s) = {t, l: <(t, l) ->
                   (t, <conc> (<s . map(Fresh)> t, l))>}

dist(s) = {l, t: <(t, l) -> t> .
           all({x : <x -> (x, l)>} . s)}

rename(s) = Init .
            rec x . (Ren <+ ((binds(s) <+ id) . dist(x)))